ABSTRACT

This paper presents a synthesis of the ideas and 
issues developed at a conference convened to review the results of 
the Dewey Decimal Classification Online Project and explore the 
potential for future use of the Dewey Decimal Classification (DDC) 
and Library of Congress Classification (LCC) schedules in online 
library catalogs. Conference discussion centered around the themes of 
subject search enhancements for the next generation of online 
catalogs, the future role of class number searching in online 
catalogs, and the feasibility of using machine-readable LCC and DDC 
schedules online. Six broad conclusions for the future are outlined: 
(1) all operational online catalogs should include the subject search 
features that have already proven necessary; (2) subject search 
strategies should be explored; (3) it is worthwhile to build the DDC 
into a classification authority file, available in machine-readable 
form for use as a cataloger's tool and in online catalogs; (4) using 
the Dewey Online Catalog (DOC) as a prototype, it is worthwhile to 
continue to refine the design of a DDC online catalog; (5) displays 
of related terms will be a valuable search enhancement in future 
online catalogs; and (6) LCC will eventually be made 
machine-readable. (KM) 

INTRODUCTION 

In January 1986, Forest Press, the Online Computer Library Center (OCLC), 
and the Council on Library Resources (CLR) brought together 30 librarians and 
information scientists working in the areas of classification, online subject 
access, and online library catalogs. The purpose of the invitational conference 
was twofold: (1) to provide an opportunity to review the results of a highly 
significant research undertaking, the Dewey Decimal Classification Online Project; 
and (2) to explore the potential for future use of the Dewey Decimal (DDC) and 
Library of Congress (LCC) classification schedules in online library catalogs. As 
background for the meeting, the participants had the more than 500-page final report 
of the DDC Online Project¹ and a brief description of a project using DDC numbers for 
online access in OHIONET member libraries. In addition, several presentations 
were made at the conference. 

During the course of the lively one and one-half-day meeting, the 
participants' discussion ranged around three themes: 

1. What subject search enhancements should we work to incorporate into 
the next generation of online catalogs? 

2. What is the future role of class number searching in online catalogs? 

3. How can we use machine-readable LCC and DDC schedules online? 

The following paper is intended to present a synthesis of the ideas and 
issues developed at the conference. It does not provide minutes of the 
discussions, nor does it attempt to summarize the wealth of valuable data and 
analyses published in the final report of the DDC Online Project. The latter is 
readily available in the project report's own excellent Executive Summary, a 
document that is required reading for anyone with a serious interest in using 
classification schedules online or in enhancing online catalogs. The objective of 
the following summary is to share the ideas of conference attendees with others in 
the profession and to stimulate others to continue to work toward exploiting the 
subject-rich content of classification schedules for use in online retrieval. 

BACKGROUND ON THE DDC ONLINE PROJECT AND THE CONFERENCE 

The DDC Online Project was a two-year investigation conducted by staff of 
the OCLC Office of Research under the direction of Research Scientist Karen Markey. 
With support from the Council on Library Resources and Forest Press (publisher of 
the DDC), the project studied the use of DDC class numbers and terms from the DDC 
Schedules and Relative Index in an experimental online catalog. The project sought 
the answers to three questions: 

1. Is the machine-readable DDC (a product developed from print tapes) 
suitable for implementation in an online catalog as a searcher's tool 
for subject access, browsing, and display? 

2. When used as a searcher's tool, does the online DDC improve the 
performance of subject searchers at an online catalog? 

3. Do subject searchers prefer an online catalog in which the DDC is 
available? 
browsing, accessible by either keywords or class numbers. The project team 
thoroughly analyzed the results of SOC and DOC searches, using failure analyses to 
determine the value of each of the DDC enhancements in subject searching. They 
gathered additional evaluative information from analyses of precision and recall, 
from analyses of the unique subject content of the enhancements, and from interviews 
with SOC and DOC users. Project results were disseminated in draft form at the end 
of 1985. The final report was published in February 1986. 

The project's research design and findings provide a wealth of new 
information on subject searching in online catalogs and on the online use of a DDC 
Schedule database. The project's sponsors realized that the final report did not 
mark the conclusion of an investigation as much as it provided both an exciting 
stimulus and a solid base for further work. To help assess and articulate the next 
phase of development in using classification schedules online, the Council on 
Library Resources organized an invitational meeting focused on the topic. 

Cosponsored by Forest Press, OCLC, and CLR, the Conference on 
Classification Schedules as Subject Enhancement in Online Catalogs was held at OCLC 
headquarters in Dublin, Ohio, January 27-28, 1986. (A list of the 30 invited 
participants is included as Attachment A.) The conference agenda included both 
prepared presentations and group discussions. Formal presentations were made by: 
William Mischo on recent subject searching enhancements in online catalogs; Anh 
Demeyer on the construction of the Dewey Online Catalog; Karen Markey on the results 
of the DDC Online Project; and Lois Chan on the potential of LCC as an online 
retrieval tool. In addition, Peter Paulson offered Forest Press's perspective on 
the future of DDC online and Carol Mandel summarized the conference discussions. 
In the following summary, the substance of the presentations is synthesized along 
with the discussion they sparked. The Chan paper, which provides a valuable, well- 
organized analysis of LCC characteristics in relation to online retrieval, is 
available as a separate publication.² Details of the Demeyer and Markey 
presentations can be found in the DDC Online Project final report. The present 
summary is organized around the three themes that emerged during conference 
discussions: (1) subject searching enhancements for the next generation of online 
catalogs; (2) class number searches in online catalogs; and (3) classification 
schedules as online tools. 

SUBJECT SEARCHING IN THE NEXT GENERATION OF ONLINE CATALOGS 
Requirements for the current generation 

The methodology of the DDC Online investigation was carefully designed to 
assess the contributions to successful subject searches made by specific features 
of SOC and DOC. In addition to evaluating DDC enhancements, the in-depth analyses 
of subject searches confirmed the value of significant features that should be (but 
are not yet universally) incorporated into currently operational online catalogs. 
These features are enumerated below. 

1. Catalogs designed for end-user searching must compensate for users' 
documented inability to match exactly the phrases that form Library of 
Congress Subject Headings (LCSH) subject terms, even when the user is 
"on the right track." Capitalization and punctuation must be 
normalized. Some spelling errors can be handled through implicit 
truncation when no match occurs (e.g., dropping plural endings). 
Another hedge against spelling discrepancies is an alphabetical 
display of subject headings that file (or would file if there is no 
match) next to the term input by the searcher. Finally, keyword 
searching of subject headings and titles is essential to effective 
subject retrieval. 

2. Online catalogs should provide more than one subject searching 
option, including keyword and exact subject searches, call number 
searches, and browsing of subject indexes. The options are 
complementary, not redundant. Each option will contribute 
additional unique retrievals to the same information quest. Also, 
whether by temperament or need, individual search styles in any given 
setting will range from quick to methodical. The online catalog must 
satisfy a variety of approaches. 

3. Displays of retrieved records should be brief, easy to interpret, and 
easy to manipulate (e.g., backward and forward scrolling). No matter 
how well-framed a search strategy may be or how accurate its results, 
the effective end result of a search is the subset of records that the 
user chooses to display. When search results are small (e.g., 1-5 
hits), 95 percent of users will display all of the retrieved records. 
When retrieval results are larger, users become discouraged at 
viewing search results unless the hits are easy to browse and 
evaluate. 

4. Online catalogs should permit users to refine large retrieval sets. 
Techniques used successfully in many online catalogs include: (1) 
the ability to add further Boolean arguments to the search argument 
and (2) the ability to limit results by characteristics such as date, 
language, and location. (More sophisticated techniques are 
discussed in the section on navigated searches.) 

5. Even when a subject search argument matches an indexed term, online 
catalogs should permit the searcher to choose whether to go directly 
to displays of search results or to view a list of subject terms that 
includes the matched term. The former option must be provided for the 
many searchers who are in a hurry, while the latter is necessary 
because users may not be aware that a more specific term (e.g., the 
subject plus a subdivision) exists and would better meet their needs. 
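
The matching behavior described in item 1 above can be illustrated with a minimal Python sketch. It is not taken from SOC, DOC, or any catalog named in this paper; the function names, the plural-stripping rule, and the sample headings are all assumptions made for illustration. The sketch normalizes capitalization and punctuation, retries with a crude implicit truncation when no match occurs, and otherwise returns the headings that would file next to the searcher's term.

```python
import bisect
import string


def normalize(term):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return " ".join(term.lower().translate(table).split())


def strip_plural(term):
    """Crude implicit truncation (an assumption): drop one trailing 's'."""
    return term[:-1] if term.endswith("s") and len(term) > 3 else term


def lookup(term, headings, window=2):
    """Return ("match", [heading]) or ("browse", neighbors in filing order)."""
    index = sorted({normalize(h): h for h in headings}.items())
    keys = [key for key, _ in index]
    query = normalize(term)
    # Try the normalized term first, then the truncated form.
    for candidate in (query, strip_plural(query)):
        pos = bisect.bisect_left(keys, candidate)
        if pos < len(keys) and keys[pos] == candidate:
            return "match", [index[pos][1]]
    # No match: show the headings filed around the point where the
    # searcher's term would have filed.
    pos = bisect.bisect_left(keys, query)
    low = max(0, pos - window)
    return "browse", [heading for _, heading in index[low:pos + window]]


# Hypothetical headings, for illustration only.
HEADINGS = ["Jewelry making", "Gold", "Goldsmithing", "Olympics",
            "Sports--Soviet Union"]
```

With these sample headings, `lookup("GOLD!", HEADINGS)` and `lookup("Golds", HEADINGS)` both resolve to the heading "Gold", while an unmatched term such as "hockey" falls back to a short alphabetical browse list.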

The next generation: related topic displays 

In existing online catalogs, the only available term displays are 
alphabetical, either by whole term or by keyword. Yet when users browse lists of 
subjects, they are seeking terms substantively related to their search argument. 
Even sophisticated searchers (including conference attendees) instinctively 
ponder "dumb" alphabetical listings produced by The Computer, seeking meaningful 
relationships. But such relationships have not yet been built into online 
catalogs. 

Identifying and articulating term relationships is an intellectual and 
editorial task; such relationships are recorded in thesauri and the captions of 
classification schedules. In online catalogs, related term displays could be 
devised from appropriately coded subject authority files or from another online 
file (such as a classification schedule) integrated with the online catalog. The 
LCSH authority tapes provide a database from which to derive displays of terms that 
are broader, narrower, or related to a user's search argument. However, as 
presently constructed, LCSH cannot be used to depict topical outlines or 
hierarchies. The LC class schedule, which is enumerative rather than 
hierarchical, also does not consistently supply an easily grasped overview of a 
subject area. The DDC is a potential source of logical and topical outlines for 
online browsing. 

The DDC Online Project provided an opportunity to supply users with 
displays of topically related terms and to study users' reactions to this aid. On 
the surface, the results are disappointing, since subject outline searches led to 
the retrieval of relevant items in only 29 percent of the cases. However, further 
examination reveals that DOC's subject outline search, while a worthy first 
attempt, contained a few significant design flaws. The most notable was one that 
led users to broader topics in the subject outline by matching terms to class numbers 
and directing users to the most general (or shortest) class numbers of those 
matched. A redesigned DOC would direct users to the class number and caption with 
the highest number of postings matched during the grouping procedure. The DDC 
outlines themselves also proved to be imperfect tools, sometimes lacking useful 
terminology or sufficient levels of hierarchy. 
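
The difference between the two entry-point rules can be shown in a few lines of Python. This is only a sketch of the selection step as described above, not DOC's actual code; the record layout, captions, and postings counts are invented for illustration.

```python
def shortest_number(matches):
    """Behavior described as a flaw: pick the most general (shortest) number."""
    return min(matches, key=lambda m: len(m["number"].replace(".", "")))


def most_posted(matches):
    """Proposed redesign: pick the matched number with the most postings."""
    return max(matches, key=lambda m: m["postings"])


# Hypothetical class numbers matched for one search term.
MATCHES = [
    {"number": "700", "caption": "The arts", "postings": 3},
    {"number": "739.27", "caption": "Jewelry", "postings": 41},
    {"number": "739.2", "caption": "Work in precious metals", "postings": 12},
]
```

Under the first rule the user lands at the broad class 700; under the second, at 739.27, where most of the matching records actually sit.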

Conferees concluded that attempts to provide users with a display of 
related topics should not be judged solely on their ability to deliver relevant 
hits. The subject outline is not a tool for the quick and dirty searcher, but for 
the methodical user; it should be evaluated in terms of users' satisfaction with its 
contribution to their development of search strategies. Also, it may be a tool that 
requires user training. If it can add significantly to the success of subject 
searching for more methodical, well-trained searchers, it will be a powerful 
enhancement for online catalogs. The DOC subject outline search was a valuable first 
attempt to present a display of related topics to online catalog users. It will be 
important to build on this research, formulating retrieval algorithms, displays, 
and specialized browsing capabilities that exploit the concept/term relationships 
that can be derived from LCSH and DDC online. 

The next generation: navigated searches 

The careful failure analyses in the DDC Online investigation highlighted 
users' lack of expertise in formulating search strategies. Users often initiate a 
subject search by simply guessing about an approach to their topic. In many cases 
(about one-third of the time) their search argument bears little resemblance to 
their own expressed information need; often they search using a term that is broader 
than the topic they seek. A typical example was the user who searched the term 
"Olympics" in order to find material on Russian sports. Even if a user enters an 
appropriately descriptive term, the next hurdle is matching this term to the 
indexing language of the catalog. Expert searchers (usually reference librarians) 
have a bag of tricks for formulating search arguments. Can some of these tricks be 
built into online catalogs? 

Navigation aids include instructions and displays that help users refine 
searches (tools that bring out the expertise of the user), as well as transparent 
operations that try successive search strategies in response to a single command. 
(One conference participant noted that users respond to transparent assistance in 
two ways: half the users are pleased with the results; the other half spend three 
hours attempting to understand the source of the results.) Conferees described 
some existing or experimental examples of navigated searches. The online catalog 
at the University of Illinois can use a match on title keywords to direct users to the 
first subject heading that appears on the matching title. Or it can take a user's 
keyword search, identify the subject headings most common in the retrieval set, and 
display these to the user. An experimental online catalog, OKAPI, uses the 
following transparent strategy in response to a user's subject search: attempt an 
exact phrase match; if no hit, attempt a keyword search; if no hit, search keywords 
separately and retrieve the two matches with the fewest postings. Some of these 
strategies can be built into microcomputer "front-ends" to online catalogs in the 
form of chained commands, assuming the indexes have been designed to support them. 
This approach is likely to increase in use as microcomputers are employed as 
gateways directing users to a selection of online databases, including the online 
catalog. There is a clear research agenda as various navigated search strategies 
must be tested, evaluated, and promulgated for use in online catalogs. 
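
The transparent fallback sequence attributed to OKAPI above can be sketched as follows. This is an illustration of the three-step idea only, not OKAPI's implementation: the record structure and function name are assumptions, and the last step is interpreted here as returning the postings of the two query words that have the fewest (non-zero) postings.

```python
def fallback_search(query, records):
    """records maps a record id to a list of heading/title strings (assumed)."""
    words = query.lower().split()
    phrase = " ".join(words)

    # Step 1: exact phrase match against a whole heading or title.
    exact = sorted(rid for rid, fields in records.items()
                   if any(f.lower() == phrase for f in fields))
    if exact:
        return "exact", exact

    def postings(word):
        return {rid for rid, fields in records.items()
                if any(word in f.lower().split() for f in fields)}

    per_word = {w: postings(w) for w in words}

    # Step 2: keyword search requiring every query word.
    if per_word:
        all_words = set.intersection(*per_word.values())
        if all_words:
            return "keyword", sorted(all_words)

    # Step 3: search words separately; keep the two with fewest postings.
    ranked = sorted((len(p), w) for w, p in per_word.items() if p)
    chosen = [w for _, w in ranked[:2]]
    hits = set().union(*(per_word[w] for w in chosen))
    return "fewest-postings", sorted(hits)


# Hypothetical records, for illustration only.
RECORDS = {
    1: ["Olympics"],
    2: ["Sports Soviet Union", "Russian athletes"],
    3: ["Sports medicine"],
    4: ["Soviet Union history"],
}
```

Each call returns the step that produced the result along with the record ids, so an interface could tell the user which strategy succeeded, which addresses the complaint about unexplained transparent assistance.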

CLASS NUMBER SEARCHING IN ONLINE CATALOGS 

In the libraries of antiquity or in contemporary computer systems, 
classification schemes have been fundamental to information retrieval. 
Classification schemes are not merely methods for shelf arrangement. The 
classified catalog is the core tool of many European libraries, and even North 
American subject catalogs are alphabetico-classed, i.e., subjects are broken down 
by subtopics within alphabetical lists. Classification schemes provide a 
different approach from subject headings, pulling together broader concepts to 
which dozens or even hundreds of separate specific topical headings might apply. 
search term "gold" to records containing the classification number for 
Jewelry-making. It has been argued that the best way to pose a precise search in large 
databases is to seek those records that contain the combination of the most precise 
DDC number, the precise LCC number, and the most specific LCSH term. The subject 
search strategies navigated by the next generation of online catalogs should 
certainly take advantage of class numbers contained in bibliographic records. 

The online catalog also needs to improve its functionality as an online 
shelflist. Most online catalogs cannot display items in true shelf order, with 
records sorted by shelf location (e.g., locations such as "Reference" or 
"Bibliography") and book numbers correctly sorted within class areas. Thus most 
online catalogs cannot support essential library staff uses of the old 3x5 
shelflist, such as book number assignment or inventory control. Good shelflist 
displays (i.e., properly sorted; easy to browse forward and backward; call number 
and author and title information displayed together) are equally important to 
library users when the shelf itself is not accessible. The online catalog can 
supply the only "spine" browse for microforms, items in storage, or materials at 
remote sites. 
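
A shelf-order sort of the kind described above can be sketched in Python. The field names and sample call numbers are assumptions, and the rules are deliberately simplified: locations sort alphabetically (a real system would use a library-defined sequence), DDC class numbers compare as decimals, and book numbers compare character by character, which treats the digits after the letter as a decimal so that G65 files before G7.

```python
def shelf_key(item):
    """Sort key: location, then class number as a decimal, then book number."""
    whole, _, fraction = item["class"].partition(".")
    # The fraction is compared as a string so that .27 files before .3.
    return (item["location"], int(whole), fraction, item["book"])


# Hypothetical items, for illustration only.
ITEMS = [
    {"location": "Stacks", "class": "739.3", "book": "A2"},
    {"location": "Reference", "class": "025.431", "book": "D5"},
    {"location": "Stacks", "class": "739.27", "book": "G7"},
    {"location": "Stacks", "class": "739.27", "book": "G65"},
]

SHELF_ORDER = [(i["location"], i["class"], i["book"])
               for i in sorted(ITEMS, key=shelf_key)]
```

Sorting the same book numbers as integers would put G7 ahead of G65, which is one of the ways an online display can fall out of true shelf order.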

Despite the potential power as an access point, class number searches 
currently constitute only about 10 percent of all online catalog searches. 
Cochrane and Markey³ argue that class number searching is underutilized because 
users need tools to aid them in identifying and selecting useful class numbers. In 
the DDC Online Project, class number searches on DOC led users to an outline that 
helped them interpret the meaning of numbers, a feature praised in the evaluative 
interviews. Even more important, users need to be led to desired class/call 
would be done by a central agency for use by many libraries. The class schedule 
database used in the DDC Online Project is fairly well developed, with a separate 
record for each number and captioned range used in the project. Each record 
contains specific content designations for the elements relating to the class 
number, such as captions, class elsewhere notes, and examples notes. Similar 
records were also used for DDC Relative Index terms leading to class numbers used in 
the project. No such product yet exists for LCC. The Gaylord Company has keyed a 
word-processed file of the LCC Schedule L, including index, which is updatable and 
suitable for viewing on optical disk. (The company is examining the results of the 
project and may choose to follow it up with further work.) This LC text database 
contains no "coding" other than indentations and returns. Experience gained from 
keying the file indicates that machine manipulation of the file would, at best, 
produce only a first, very rough cut at content designation. An arduous task of 
record-by-record interpretation and editing would be required before the LCC 
Schedule could become a machine-readable authority file. 

The potential use of online classification schedules will depend, in part, 
on how they are encoded. The machine-readable databases are developing from the 
production, editing, and distribution process of the schedules. Thus the initial 
use of the computerized schedules is as an editor's and classifier's tool. Forest 
Press has used the DDC file to create an online access and editing system for 
producing the 20th edition. The Press plans to continue work to restructure the 
print tapes into a logical database and to edit the Relative Index for online 
searching. This includes eliminating ambiguous abbreviations, editing for 
consistency in indentation, and adding index terms for the many class numbers that 
lack them. The Press hopes to be able to make this database available as a 
cataloger's support tool. The potential optical disk LCC is also intended as a 
cataloger's tool, although it would function as an electronic reference book rather 
than as an authority file. 

The potential use of a classification schedule in online catalogs falls 
into three areas: (1) as an authority file; (2) as a source of additional searching 
vocabulary; and (3) as a tool built into the search strategy options offered by the 
catalog. Looking first at an authority file, one can see such a file already 
developing from the Forest Press DDC tapes. The tape consists of class number 
records that include captions, notes, references to related numbers, and references 
linking new and changed numbers with those previously used. While the initial use 
of this tape will be as a resource file, it is also possible to envision its future 
use as an authority file within an online catalog, linked to class numbers in 
bibliographic records and used for file maintenance and as a source of links to other 
terms, references, and explanatory information. A regularly updated Forest Press 
product could be used to keep class numbers in catalogs current. However, unless 
libraries begin to use class number access points that are separate from call 
numbers, it is unlikely that they will want to see class numbers automatically 
"flipped" in the same manner as updated name headings. One current problem with the 
DDC file as an authority is the lack of records for synthesized numbers that are not 
explicit in the schedule; Forest Press is currently exploring the possibilities for 
assembling and parsing these numbers by machine. Should the LCC schedules be 
developed into authority files, numbers currently built from auxiliary tables would 
present a similar problem. 
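
A classification authority record of the kind described in this section, with links from previously used numbers to current ones, can be sketched as a small data structure. The field names below are assumptions and do not reflect the layout of the Forest Press tapes; the sample record is illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class ClassRecord:
    number: str
    caption: str
    notes: list = field(default_factory=list)
    related: list = field(default_factory=list)    # references to related numbers
    formerly: list = field(default_factory=list)   # numbers previously used


def build_former_index(authority):
    """Map each previously used number to the number that replaced it."""
    return {old: rec.number for rec in authority for old in rec.formerly}


def current_number(number, former_index):
    """Follow 'formerly' links until a current number is reached."""
    seen = set()
    while number in former_index and number not in seen:
        seen.add(number)
        number = former_index[number]
    return number


# Illustrative sample, not copied from the schedules.
AUTHORITY = [ClassRecord("004", "Data processing", formerly=["001.6"])]
FORMER = build_former_index(AUTHORITY)
```

A catalog could use such a lookup to add the current number as a separate class number access point on an older record while leaving the call number on the item untouched.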

While there is little (if any) current demand for online catalogs with 
classification authority files, there is considerable demand for augmented subject 
vocabulary in online catalogs, an enhancement that would serve both quick and 
methodical users. The DDC Online Project demonstrated that terms from the DDC 
captions and Relative Index were valuable in keyword searching. In fact, the DDC 
provided more unique terms to DOC than LCSH, although not all of these unique terms 
were good indicators of a book's content. (About 25 percent of DDC schedule terms 
were unsatisfactory as subject terms, lacking in subject content or containing 
awkward wording.) Chan's assessment of LCC indicates that it too would add 
considerable new vocabulary beyond LCSH as well as thousands of proper names (not 
necessarily in AACR2 form!). The use of keywords from class schedules is 
particularly appealing in systems that include records containing class numbers but 
lacking controlled vocabulary terms, such as catalogs derived from circulation 
systems or large databases of minimal level records. However, given what we now 
know about subject searching and the need for multiple approaches, Relative Index 
terms cannot be viewed as a substitute for subject indexing of most library 
materials. Also, terms derived from call number links will cover only one aspect of 
the subject content of a book. The DOC results do confirm that added vocabulary 
from classification schedules can help to match users' search terms to records in 
the catalog. 
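
Vocabulary augmentation of this kind amounts to indexing each bibliographic record under the words of the schedule caption for its class number, in addition to its own title and subject words. The Python sketch below illustrates the idea under assumed record and caption structures; it is not the indexing method used in DOC, and the sample data are invented.

```python
from collections import defaultdict


def build_keyword_index(records, captions=None):
    """Index records by title and subject words, plus caption words if given."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        texts = [rec["title"], *rec.get("subjects", [])]
        if captions is not None:
            # Uncontrolled vocabulary borrowed from the class schedule.
            texts.append(captions.get(rec["class"], ""))
        for text in texts:
            for word in text.lower().split():
                index[word].add(rec_id)
    return index


# Hypothetical minimal-level record with a class number but no subject headings.
RECORDS = {1: {"title": "Working in precious metals", "subjects": [],
               "class": "739.27"}}
CAPTIONS = {"739.27": "Jewelry"}

PLAIN = build_keyword_index(RECORDS)
AUGMENTED = build_keyword_index(RECORDS, CAPTIONS)
```

A search on "jewelry" finds nothing in the plain index but retrieves record 1 from the augmented one, which is the effect described for records that lack controlled vocabulary terms.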

The DDC Online Project has brought the DDC close to readiness for use in 
vocabulary augmentation. It identified editorial work that needs to be done to 
make the language of the captions and Index more useful, and the DDC editors have 
embarked on this work. DOC also tested strategies for moving from DDC keywords to 
bibliographic records, and the research team has identified recommended 
refinements. This is an area that merits further research, both testing 
strategies for matching terms with numbers and, for the benefit of the many LCC 
libraries, investigating the use of DDC terms with the DDC class numbers in LC MARC 
records rather than with specific call numbers. A catalog effectively using DDC 
keyword searching could be ready by the time the 20th edition is available on tape. 

The use of LCC for vocabulary augmentation is less imminent. The LCC 
indexes are unintegrated and cumbersome; even if they were made machine-readable 
they could not be effectively indexed and stored in online catalogs in their current 
unedited form. Hierarchical relationships between numbers and captions are 
evident in indentions in the schedules, and these indentions would have to be coded 
in a machine-readable format for LCC to maintain and express relationships between 
captions and LCC numbers. Caption terminology would also require editing. Both 
LCC and DDC terms would be added to keyword indexes as uncontrolled vocabulary; such 
vocabulary brings the risk of too-large retrievals and false drops. While DOC 
demonstrated that this problem is manageable for DDC terms, similar careful 
research would be required to monitor the impact of added LCC terms on retrieval 
results. This is a field wide open for future research. 
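
The coding of indentions mentioned above can be illustrated with a short Python sketch that turns indentation depth into explicit parent links. The input format (one number and caption per line, two spaces per level) and the sample entries are assumptions made for illustration; they are not drawn from the LCC schedules, and, as noted earlier, a mechanical pass of this kind would give only a rough first cut that still needs editorial review.

```python
def parse_indented_schedule(lines, indent_width=2):
    """Return (number, caption, parent_number) tuples from indented text."""
    entries = []
    stack = []  # (depth, number) for the current chain of ancestors
    for line in lines:
        if not line.strip():
            continue
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // indent_width
        number, _, caption = stripped.partition(" ")
        # Discard entries at the same or a deeper level than this line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1] if stack else None
        entries.append((number, caption.strip(), parent))
        stack.append((depth, number))
    return entries


# Made-up schedule fragment, for illustration only.
SAMPLE = [
    "A1 General works",
    "  A2 Periodicals",
    "    A3 By region",
    "  A4 Societies",
    "B1 History",
]
ENTRIES = parse_indented_schedule(SAMPLE)
```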

Ultimately, class schedules can provide future online catalogs with 
additional tools for navigating searches. The schedules can be viewed as 
arrangements of concepts that are linked to numbers that locate books. Relatively 
few end users will initially enter class numbers into the keyboard; instead systems 
will use searchers' terms to match class numbers and use these numbers "behind the 
screen" to provide search options such as browsing subject outlines, refining 
subjects by aspect, browsing a shelf arrangement, and viewing related terms. The 
existence of the DDC in many translations even makes it possible to imagine the DDC 
numbers as a switching language in a multilingual system using terms from 
translations of the captions and Index. As we look toward building expert systems 
into online catalog searches, classification schedules appear to be a promising 
source of "expertise." 
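
The switching-language idea can be sketched very simply: a term in one language is matched to class numbers through that language's captions, and the numbers are then used to pull captions in another language. The data layout, function names, and the hand-made caption pair below are assumptions for illustration and are not taken from any published DDC translation.

```python
def build_term_index(captions_by_language):
    """Map (language, word) to the class numbers whose caption contains it."""
    index = {}
    for language, captions in captions_by_language.items():
        for number, caption in captions.items():
            for word in caption.lower().split():
                index.setdefault((language, word), set()).add(number)
    return index


def switch(term, source, target, captions_by_language):
    """Translate a search term into (class number, target caption) pairs."""
    index = build_term_index(captions_by_language)
    numbers = sorted(index.get((source, term.lower()), set()))
    target_captions = captions_by_language[target]
    return [(n, target_captions[n]) for n in numbers if n in target_captions]


# Hand-made sample captions, for illustration only.
CAPTIONS = {
    "en": {"739.27": "Jewelry"},
    "fr": {"739.27": "Bijouterie"},
}
```

Because the class number is the pivot, the same bibliographic records are retrieved regardless of the language in which the searcher entered the term.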

CONCLUSION 

Conference participants agreed that the DDC Online Project will stimulate 
considerable future research and development. In general, the next broad steps 
seem to be taking shape as follows: 

1. All operational online catalogs should include the subject search 
features that have already proven necessary: e.g., normalization, 
component word searching, browse displays, a mix of subject searching 
options, search limitation, and the other features enumerated at the 
beginning of this paper. 

2. Subject searching requires multiple, complementary approaches. The 
next stage of online catalog development is the exploration of subject 
search strategies: built-in sequences of arguments and algorithms 
that overcome "no hits," refine too-large retrievals, and aid in the 
selection of search terms. 

3. It is worthwhile to continue to build the DDC into a classification 
authority file, available in machine-readable form for use as a 
catalogers' tool and in online catalogs. 

4. Using DOC as a prototype, it is worthwhile to continue to refine the 
design of a DDC online catalog, learning more about the use of 
augmented vocabulary and assisted search strategies. 

5. Displays of related terms will be a valuable search enhancement in 
future online catalogs. It is worthwhile to continue to explore the 
use of displays of related terms in subject searching, improving on 
DOC's subject outlines and developing useful related term displays 
from the LCSH tapes. 

6. LCC will eventually be made machine-readable, at least for ease of 
editing, updating, and electronic publication. As changes in the LCC 
editorial process are considered, they should be viewed in light of an 
evolution (however protracted) toward an LCC authority file. 